Unitary Operators on the Document Space
Author
Abstract
When people search for documents, they ultimately want content, not words. Search engines should therefore relate documents by their underlying concepts rather than by the words they contain. One promising technique for doing so is Latent Semantic Indexing (LSI). LSI dramatically reduces the dimension of the document space by mapping it into a space spanned by conceptual indices. Empirically, the number of concepts needed to represent the documents is far smaller than the number of distinct words in their textual representation. Although this almost obviates the problem of lexical matching, the mapping incurs a high computational cost compared to document parsing, indexing, query matching, and updating. This article makes three contributions. First, it shows that the technique underlying LSI is just one example of a unitary operator, for which there are computationally more attractive alternatives. Second, it proposes the Haar transform as such an alternative, since it is memory efficient and can be computed in linear to sublinear time. Third, it generalizes LSI by a multiresolution representation of the document space. The approach not only preserves the advantages of LSI at drastically reduced computational cost but also opens a spectrum of possibilities for new research.
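To make the abstract's central claim concrete, the following is a minimal sketch of an orthonormal Haar transform applied to document vectors. It is not the article's implementation: the function names, the term-count vectors, and the restriction to power-of-two lengths are all assumptions made for illustration. The sketch shows the two properties the abstract relies on: the transform is unitary (it preserves norms and inner products, so document similarities are unchanged), and it runs in linear time (the passes touch n + n/2 + n/4 + ... < 2n entries).

```python
import math


def haar_transform(vec):
    """Orthonormal Haar transform of a sequence whose length is a power of two."""
    n = len(vec)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = [float(x) for x in vec]
    s = math.sqrt(2.0)
    length = n
    while length > 1:
        half = length // 2
        # Scaled pairwise sums (coarse part) and differences (detail part).
        avg = [(out[2 * i] + out[2 * i + 1]) / s for i in range(half)]
        diff = [(out[2 * i] - out[2 * i + 1]) / s for i in range(half)]
        out[:length] = avg + diff
        length = half  # recurse on the coarse part only
    return out


def inverse_haar_transform(coeffs):
    """Inverse of haar_transform; for a real orthonormal map this is its transpose."""
    n = len(coeffs)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = [float(x) for x in coeffs]
    s = math.sqrt(2.0)
    length = 2
    while length <= n:
        half = length // 2
        merged = [0.0] * length
        for i in range(half):
            a, d = out[i], out[half + i]
            merged[2 * i] = (a + d) / s
            merged[2 * i + 1] = (a - d) / s
        out[:length] = merged
        length *= 2
    return out


def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical term-count vectors for two documents over an 8-word vocabulary.
doc_a = [2, 0, 1, 1, 0, 0, 3, 1]
doc_b = [1, 1, 0, 2, 0, 1, 1, 0]

ha, hb = haar_transform(doc_a), haar_transform(doc_b)
print(dot(doc_a, doc_b), dot(ha, hb))  # the two values agree up to rounding
```

Because inner products are preserved, cosine similarities between documents can be computed on the Haar coefficients instead of the raw term counts. Keeping only the leading (coarse) coefficients gives a lower-resolution view of each document, which is one way to read the multiresolution representation mentioned in the abstract.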
Similar resources
A characterization of orthogonality preserving operators
In this paper, we characterize the class of orthogonality preserving operators on an infinite-dimensional Hilbert space $H$ as scalar multiples of unitary operators between $H$ and some closed subspaces of $H$. We show that any circle (centered at the origin) is the spectrum of an orthogonality preserving operator. Also, we prove that every compact normal operator is a strongly orthogo...
Coherent Frames
Frames which can be generated by the action of some operators (e.g. translation, dilation, modulation, ...) on a single element $f$ in a Hilbert space are called coherent frames. In this paper, we introduce a class of continuous frames in a Hilbert space $\mathcal{H}$ which is indexed by some locally compact group $G$, equipped with its left Haar measure. These frames are obtained as the orbits of ...
Multi-Frame Vectors for Unitary Systems in Hilbert $C^{*}$-modules
In this paper, we focus on the structured multi-frame vectors in Hilbert $C^*$-modules. More precisely, it will be shown that the set of all complete multi-frame vectors for a unitary system can be parameterized by the set of all surjective operators, in the local commutant. Similar results hold for the set of all complete wandering vectors and complete multi-Riesz vectors, when the surjective ...
Extension of the Hilbert Space by J-Unitary Transformations
A theory of non-unitary unbounded similarity transformation operators is developed. To this end the class of J-unitary operators U is introduced. These operators are similar to unitary operators in their algebraic aspects but differ in their topological properties. It is shown how J-unitary operators are related to so-called J-biorthonormal systems and J-selfadjoint projections. Families {Uα} o...
A note on $\lambda$-Aluthge transforms of operators
Let $A=U|A|$ be the polar decomposition of an operator $A$ on a Hilbert space $\mathscr{H}$ and $\lambda\in(0,1)$. The $\lambda$-Aluthge transform of $A$ is defined by $\tilde{A}_\lambda:=|A|^\lambda U|A|^{1-\lambda}$. In this paper we show that (i) when $\mathscr{N}(|A|)=0$, $A$ is self-adjoint if and only if so is $\tilde{A}_\lambda$ for some $\lambda\neq\frac{1}{2}$. Also $A$ is self-adjoint if and onl...
The Sum of Unitary Similarity Orbits Containing Only Special Operators
Dedicated to Professor Shmuel Friedland. Let $B(H)$ be the algebra of bounded linear operators acting on a Hilbert space $H$ (over the complex or real field). Characterization is given to $A_1, \dots, A_k \in B(H)$ such that for any unitary operators $U_1, \dots, U_k$, $\sum_{j=1}^{k} U_j^* A_j U_j$ is always in a special class $S$ of operators such as normal operators, self-adjoint operators, unitary operators. ...